NSF PAR Search | NSF Public Access Repository


Soil Carbon Mapping of the Contiguous US Using VNIR Spectra Within A Heterogeneous Spatial Model

https://doi.org/10.1007/s13253-025-00679-5

Parker, Paul A; Sansó, Bruno (February 2025, Journal of Agricultural, Biological and Environmental Statistics)

Abstract: The Rapid Carbon Assessment, conducted by the US Department of Agriculture, was implemented in order to obtain a representative sample of soil organic carbon across the contiguous US. In conjunction with a statistical model, the dataset allows for mapping of soil carbon prediction across the US; however, there are two primary challenges to such an effort. First, there exists a large degree of heterogeneity in the data, whereby both the first and second moments of the data generating process seem to vary both spatially and for different land-use categories. Second, the majority of the sampled locations do not actually have laboratory-measured values for soil organic carbon. Rather, visible and near-infrared (VNIR) spectra were measured at most locations, which act as a proxy to help predict carbon content. Thus, we develop a heterogeneous model to analyze this data that allows both the mean and the variance to vary as a function of space as well as land-use category, while incorporating VNIR spectra as covariates. After a cross-validation study that establishes the effectiveness of the model, we construct a complete map of soil organic carbon for the contiguous US along with uncertainty quantification.
Bayesian Non-Parametric Inference for Multivariate Peaks-over-Threshold Models

https://doi.org/10.3390/e26040335

Trubey, Peter; Sansó, Bruno (April 2024, Entropy)

We consider a constructive definition of the multivariate Pareto that factorizes the random vector into a radial component and an independent angular component. The former follows a univariate Pareto distribution, and the latter is defined on the surface of the positive orthant of the infinity norm unit hypercube. We propose a method for inferring the distribution of the angular component by identifying its support as the limit of the positive orthant of the unit p-norm spheres and introduce a projected gamma family of distributions defined through the normalization of a vector of independent random gammas to the space. This serves to construct a flexible family of distributions obtained as a Dirichlet process mixture of projected gammas. For model assessment, we discuss scoring methods appropriate to distributions on the unit hypercube. In particular, working with the energy score criterion, we develop a kernel metric that produces a proper scoring rule and present a simulation study to compare different modeling choices using the proposed metric. Using our approach, we describe the dependence structure of extreme values in the integrated vapor transport (IVT), data describing the flow of atmospheric moisture along the coast of California. We find clear but heterogeneous geographical dependence.
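The construction described in this abstract can be illustrated with a minimal sketch, offered here only as a reading aid and not as the authors' code. It draws a vector of independent gammas, normalizes it onto the positive orthant of a unit p-norm sphere, and scales the result by an independent Pareto radius. The function names, the unit gamma scale, the Pareto minimum of 1, and the `alpha` parameter are assumptions made for illustration; the paper's model is a Dirichlet process mixture of projected gammas, whereas this sketch draws from a single component.

```python
import numpy as np


def sample_projected_gamma(n, shape, p=np.inf, rng=None):
    """Normalize independent gamma vectors onto the unit p-norm sphere.

    Hypothetical helper: `shape` holds one gamma shape parameter per
    dimension, and the gamma scale is fixed at 1 (an assumption).
    """
    rng = np.random.default_rng(rng)
    shape = np.asarray(shape, dtype=float)
    # One independent gamma draw per dimension, for each of the n samples.
    g = rng.gamma(shape, 1.0, size=(n, shape.size))
    # Row-wise p-norm; p = inf gives the largest coordinate of each row.
    norms = np.linalg.norm(g, ord=p, axis=1, keepdims=True)
    return g / norms


def sample_multivariate_pareto(n, shape, alpha=1.0, rng=None):
    """Radial Pareto component times an independent angular component."""
    rng = np.random.default_rng(rng)
    # Generator.pareto draws from the Lomax (Pareto II) distribution;
    # adding 1 gives a classical Pareto with minimum value 1 (an assumption).
    radius = 1.0 + rng.pareto(alpha, size=(n, 1))
    angle = sample_projected_gamma(n, shape, p=np.inf, rng=rng)
    return radius * angle
```

With `p=np.inf`, every angular draw has its largest coordinate equal to 1, so it lies on the surface of the positive orthant of the infinity-norm unit hypercube; finite values of `p` give points on the corresponding p-norm spheres.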
Full Text Available
Multivariate nearest‐neighbors Gaussian processes with random covariance matrices

https://doi.org/10.1002/env.2839

Grenier, Isabelle; Sansó, Bruno; Matthews, Jessica L (May 2024, Environmetrics)

Abstract: We propose a non‐stationary spatial model based on a normal‐inverse‐Wishart framework, conditioning on a set of nearest‐neighbors. The model, called nearest‐neighbor Gaussian process with random covariance matrices, is developed for both univariate and multivariate spatial settings and allows for fully flexible covariance structures that impose no stationarity or isotropic restrictions. In addition, the model can handle duplicate observations and missing data. We consider an approach based on integrating out the spatial random effects that allows fast inference for the model parameters. We also consider a full hierarchical approach that leverages the sparse structures induced by the model to perform fast Monte Carlo computations. Strong computational efficiency is achieved by leveraging the adaptive localized structure of the model that allows for a high level of parallelization. We illustrate the performance of the model with univariate and bivariate simulations, as well as with observations from two stationary satellites consisting of albedo measurements.
Full Text Available
Nearest-Neighbor Mixture Models for Non-Gaussian Spatial Processes

https://doi.org/10.1214/23-BA1405

Zheng, Xiaotian; Kottas, Athanasios; Sansó, Bruno (December 2023, Bayesian Analysis)

Full Text Available
Bayesian geostatistical modeling for discrete‐valued processes

https://doi.org/10.1002/env.2805

Zheng, Xiaotian; Kottas, Athanasios; Sansó, Bruno (April 2023, Environmetrics)

Full Text Available
Comparing emulation methods for a high‐resolution storm surge model

https://doi.org/10.1002/env.2796

Hutchings, Grant; Sansó, Bruno; Gattiker, James; Francom, Devin; Pasqualini, Donatella (May 2023, Environmetrics)

Full Text Available
Fast inference for time-varying quantiles via flexible dynamic models with application to the characterization of atmospheric rivers

https://doi.org/10.1214/21-AOAS1497

Barata, Raquel; Prado, Raquel; Sansó, Bruno (March 2022, The Annals of Applied Statistics)

Full Text Available
On Construction and Estimation of Stationary Mixture Transition Distribution Models

https://doi.org/10.1080/10618600.2021.1981342

Zheng, Xiaotian; Kottas, Athanasios; Sansó, Bruno (January 2022, Journal of Computational and Graphical Statistics)

Full Text Available
Distributed nearest-neighbor Gaussian processes

https://doi.org/10.1080/03610918.2021.1921798

Grenier, Isabelle; Sansó, Bruno (April 2021, Communications in Statistics - Simulation and Computation)
While many statistical approaches have tackled the problem of large spatial datasets, the issues arising from costly data movement and data storage have long been set aside. Having easy access to the data has been taken for granted and is now becoming an important bottleneck in the performance of statistical inference. As the availability of high resolution spatial data continues to grow, the need to develop efficient modeling techniques that leverage multi-processor and multi-storage capabilities is becoming a priority. To that end, the development of a distributed method to implement Nearest-Neighbor Gaussian Process (NNGP) models for spatial interpolation and inference for large datasets is of interest. The proposed framework retains the exact implementation of the NNGP while allowing for distributed or sequential computation of the posterior inference. The method allows for any choice of grouping of the data whether it is at random or by region. As a result of this new method, the NNGP model can be implemented with an even split of the computation burden with minimum overload at the master node level.
Full Text Available
